Right now we have a first time DEF CON speaker, Franz Peyer. He's going to speak to you about
exploiting music streaming with JavaScript. It's his first time at DEF CON. Please give
him a big round of applause. All right. I hope you guys are all awake.
So I'm Franz Peyer, programmer at Tactical Network Solutions, and I'm going to go over
exploiting music streaming with JavaScript. So a couple of acknowledgments before I start. I'd like to thank Zachary
Cutlip and Craig Heffner for all the help and support. And my employer, Tactical Network
Solutions, for letting me learn about security without going to jail, which is great.
Special thanks to Ronald Jenkees, who is an independent artist who has given me permission
to use his music in this presentation so I don't get sued by the RIAA. Speaking of which,
I'd like to thank the EFF for helping me address issues with the DMCA and the CFAA.
However, the decision was made not to release the original tool, which I had planned to release.
That was a Google Chrome extension which would showcase all the different exploits and vulnerabilities
which I'm going to cover today. But I will be releasing an alternative tool which I'll
get into more detail later. I also want to state that the opinions and views expressed
here are mine and not my employer's. So what am I going to be talking about? Well,
I'm going to give you some background information of what my project is and what I've done so
you can have a context of, like, my approach and how I did it. So I'm going to start with
the music streaming basics so you guys have an understanding of how it works and the
limitations today and what you're going to be seeing. And then I'm going to go over my
security investigation process, so kind of taking you from the beginning to the end
of research to exploitation. And hopefully by the end of this you'll have a pretty good
grasp on how you can do this by yourself. And then I'll go over an exploit demo, assuming
everything works out all right. And if I have any time afterwards, I'll talk about
my new alternative extension which I'm going to be releasing.
And I'll take questions at the end if we have any time.
So the end goal. Well, originally I had planned to release a Google Chrome extension which
would have all the different exploits which I'm going to show you. And the way this would
work is that it would mimic the music player whenever possible, so whenever I was smart
enough to figure out how to reverse engineer the code and generate the requests the way
they did. Otherwise, I would just log whenever I saw some MP3 flying by, look at the
syntax of its URL, and every time something matched that syntax,
I could go and get that song. And so the end result is that you have something that sits
in the background and every time you listen to a song, you can download it.
But I'm not going to be releasing that. So what am I going to be releasing? Well, it's
an alternative which is not exactly the same. It's a forensics tool, not an exploitation
tool. What it does is duplicate the requests that it sees flying by and cache them in
your RAM. And this is helpful for, like, a hex dump analysis afterwards. So if you are
into doing malware analysis without wanting to put anything on your hard drive,
you can do it entirely from your browser now.
And this is also helpful if you want to see exactly what's being loaded into your browser.
So here's the wall of shame, a bunch of different services which I found vulnerabilities in.
Some of these have made fixes. Some of them haven't. Most of them haven't. So we have
Pandora, Aimini, SoundCloud, Grooveshark, Jango, Playlist.com, and 8tracks. So we have
quite a big list. All right. So what is streaming? Well, Wikipedia defines it as a way to constantly
receive and present data while it's being delivered by a provider. So from a developer's
point of view, this means that you're going to be receiving data in a really long stream.
And as soon as you get the first piece, you can start processing it and displaying it
to the viewer. From an attacker's point of view, this means that you're going to constantly
be receiving data. And at the very end, you'll have all the
data you need to reconstruct the file, whether it's a song or whatever. And it's only a matter
of capturing the data pieces. And once you do that, there are two major roadblocks that
prevent you from playing it back. First, you have reassembly of the pieces. Typically,
the way the Internet works is that when you send data, it's not always received
in the same order in which it was sent. And so this could be more difficult depending
on what type of protocol they use. And if you see any encryption, that's probably
going to be stopping you because usually if you want to break encryption, it's not
to get music, it's to get someone's password. So that's going to be your major roadblock.
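As a rough illustration of the reassembly problem only, here is a minimal sketch that puts out-of-order pieces back together. It assumes every piece carries a sequence number; the `{ seq, bytes }` shape and the function name are hypothetical and not taken from any service discussed in the talk.

```javascript
// Hypothetical piece shape: { seq: number, bytes: Uint8Array }.
function reassemble(pieces) {
  // Sort a copy by sequence number so arrival order does not matter.
  const ordered = [...pieces].sort((a, b) => a.seq - b.seq);
  const total = ordered.reduce((n, p) => n + p.bytes.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const piece of ordered) {
    out.set(piece.bytes, offset); // copy this piece at its final position
    offset += piece.bytes.length;
  }
  return out;
}

// Pieces that arrived out of order.
const received = [
  { seq: 2, bytes: Uint8Array.from([5, 6]) },
  { seq: 0, bytes: Uint8Array.from([1, 2]) },
  { seq: 1, bytes: Uint8Array.from([3, 4]) },
];
const rebuilt = reassemble(received);
```

With a custom or undocumented protocol the hard part is finding out what plays the role of `seq`, which is why the browser-based services are the easier case.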
So the protocols which I've seen, typically when you have some sort of desktop application
like Spotify or Pandora One, they will use a custom TCP protocol. And this makes it incredibly
difficult to reassemble because, A, it's either not documented or, B, it's proprietary
and you don't know how it works. So this is probably going to stop you dead
in your tracks. However, I've noticed that some services like Last.fm use HTTP or HTTPS
because they don't want to write their own custom protocol. And ‑‑ but this is typically
what you see in some browser‑based applications. So the regular Pandora app, SoundCloud, these
guys are going to be using HTTP and HTTPS. And this is because they don't want to do
extra coding. The browser does it all for you. Why would you want to do it yourself?
And if you're an attacker or hacker or whatever, this means that you can use the browser, too.
Now you don't have to worry about reassembly and decryption. So this is why I targeted
these because they're extremely easy to go after.
And there are two different types of streaming which I kind of named myself. There is static
streaming where you will have one URL per song. So you'll have to usually reference
it by a file name and you'll have to know the directory and everything. And this is
different from a dynamic one where you have one page and depending on what parameters
you send it, you get back a different file. So if you see here, we have a stream.php page
and depending on what key you send it, you get back a different file. So it's one‑to‑one
versus one‑to‑many. And this is important to keep track of when you're doing analysis.
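The static versus dynamic distinction can be sketched as a small URL classifier. This is only a heuristic under stated assumptions: the file extensions, the example hosts, and the function name are made up for illustration.

```javascript
// Heuristic: a path ending in an audio file name with no query string looks
// "static" (one URL per song); a page that takes query parameters looks
// "dynamic" (one page, many songs).
function classifyStreamUrl(rawUrl) {
  const url = new URL(rawUrl);
  const lastSegment = url.pathname.split("/").pop();
  const hasQuery = [...url.searchParams.keys()].length > 0;
  const looksLikeAudioFile = /\.(mp3|m4a|ogg|aac)$/i.test(lastSegment);
  if (looksLikeAudioFile && !hasQuery) return "static";
  if (hasQuery) return "dynamic";
  return "unknown";
}

const staticKind = classifyStreamUrl("http://media.example.invalid/music/08/06/track.mp3");
const dynamicKind = classifyStreamUrl("http://example.invalid/stream.php?key=abc123");
```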
So there are two major types of music players. The most common one is Flash. This is a majority
of the web players you'll see.
However, they may still use JavaScript. I've seen a lot of people be really lazy and they
will actually use JavaScript to pull back the data and then just pass it on to the Flash.
So all the Flash does is play it back. And this is typically because you have Flash libraries
that are made to play back music, but not necessarily customized to work with your interface.
So this is how they get around it. However, with some services, you're going to have to decompile
the Flash. And I don't know any Flash myself, so I skipped anything that did this. But it's
important to know that if you are successful at decompiling the Flash and you want to exploit
it using a Chrome extension with JavaScript, it runs in a separate environment due to security
issues. So if there's some kind of secret key baked into the Flash, you're not going
to get it from JavaScript.
And then the other one you'll see is HTML5, which is kind of experimental right now because
not all the browsers have full support for it yet. And you typically see this in mobile‑based
applications because Flash is losing support there. And this is entirely in JavaScript.
So there's no decompiling or anything, but it's minified most of the time, which means that it's obfuscated
and it's really hard to read.
So where's the vulnerability? Well, I already went over how the browser does all the major
work for you. So what do you have to do? Well, there are two ways of going about this.
I mentioned this earlier. You can copy the requests by kind of just telling based on
the syntax of the URL. And this is typically pretty easy. You look at one URL and you're
like, all right, there's a file name.
And this is the structure. And, you know, you can easily write a regular expression
to do this. However, this can be suspicious. If they're doing some kind of server‑side
logging and they see two identical requests coming in within milliseconds of each other,
they're probably like, huh, why ‑‑ this isn't normal activity. Why is this happening?
But I haven't really seen this being an issue in terms of any red flags being thrown up.
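To make the "write a regular expression" step concrete, here is a minimal sketch that picks song-like URLs out of a list of observed requests. The URL shape (`/audio/<numeric id>.mp3`) and the hosts are invented for the example; a real service would need its own pattern.

```javascript
// Assumed, made-up URL syntax: http(s)://<host>/audio/<digits>.mp3
const SONG_URL = /^https?:\/\/[^/]+\/audio\/(\d+)\.mp3$/;

// Return the captured IDs of every URL that matches the song syntax.
function extractSongIds(urls) {
  const ids = [];
  for (const u of urls) {
    const match = SONG_URL.exec(u);
    if (match) ids.push(match[1]);
  }
  return ids;
}

const seen = [
  "http://example.invalid/css/site.css",
  "http://cdn.example.invalid/audio/1234.mp3",
  "http://example.invalid/api/next",
];
const songIds = extractSongIds(seen);
```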
But this can be limiting. I found services where they have one‑time use tokens. So you use a token to
stream your music. And after it's been used, it's no longer valid. So by the time your second
request gets there, it's not valid. And you don't get anything back.
And the way you get around this is to generate the requests yourself. So this is going to
be a little bit more difficult. Sometimes you can tell based on the syntax of the URL
what variables are needed and how you get them, other times you have to reverse engineer
the code and figure out what they're doing. But if you are successful, you get past the
limitation there. And it's essentially undetectable, especially
when there are sessions. So if they have sessions, then it looks from the server side that two
requests are coming in from two different people in the same IP address and they just
happen to be listening to the same song. All right.
So how do you go about doing this? Well, it's important to keep in mind
that you've got to do breadth before depth. You don't
want to dig yourself into the first thing you see, waste two hours, and then figure out that's
the wrong thing. You want to keep track of all your possible options and
then take the path of least resistance.
And
once you keep that in mind, you want to locate the music file in the network traffic. So
you can do this in Chrome by opening up the developer console and then going to the network
tab and you can see all the traffic flying by. And there's going to be a lot of traffic,
so you want to filter.
And the reason why you want to do this is because typically when music is loaded in
a streaming service, especially an Internet radio service, the music isn't loaded when
the page is sent to you originally. Like with Pandora, the songs are actually loaded
after the page has been loaded. This is because they want to have time to look at your recommendations
and figure out what song to give you next. Also, they don't know how many songs you'll
be listening to before you go away.
So they have to load this after the fact. And this is done through Ajax, which is going
to be showing up as XHR traffic. And you can also sort by type, like looking for audio
files, because that's probably what you're going to be finding.
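The filtering described here is done by hand in the developer tools, but the same idea can be sketched over an exported list of requests. The entry shape `{ url, type, mimeType }` is an assumption made for this example, not a DevTools API.

```javascript
// Keep only requests made through XHR.
function xhrOnly(entries) {
  return entries.filter((e) => e.type === "xhr");
}

// Keep only requests whose response type is some kind of audio.
function audioOnly(entries) {
  return entries.filter((e) => (e.mimeType || "").toLowerCase().startsWith("audio/"));
}

const trafficLog = [
  { url: "http://example.invalid/app.js", type: "script", mimeType: "application/javascript" },
  { url: "http://example.invalid/next", type: "xhr", mimeType: "application/json" },
  { url: "http://example.invalid/a/1.mp3", type: "xhr", mimeType: "audio/mpeg" },
];
const candidates = audioOnly(xhrOnly(trafficLog));
```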
Once you find the actual request, you want to inspect any parameters in that request,
so headers, any kind of parameters that they send in the URL, stuff like that. Then you
want to find out where those values come from. And there are many different locations where
these values are going to come from.
You want to go from the easiest to the hardest. The
easiest place is the page URL. Sometimes the song ID is in the URL of the page you're on, and
you can just use that to get the song. After that, you might want to look at the page source,
do a control F, look for the name of the parameter, you might be able to find it. Then
you might want to look at local storage, possibly cookies as well, because I've seen with services
like Grooveshark that if you have a
playlist or something, they will send the whole thing to you at once so you don't have
to keep making requests to find out what the next song is.
And at the very end, you want to look at JavaScript because that's going to be hard
to read and you're going to have to figure out what someone else's code is doing.
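The easiest-to-hardest search order can be sketched as an ordered list of lookups that stops at the first hit. Everything here is a stand-in: the page URL, the page source string, and the storage object are plain values invented for the example rather than real browser state.

```javascript
// sources is an ordered list of [label, lookup] pairs, cheapest first.
function locateParam(name, sources) {
  for (const [label, lookup] of sources) {
    const value = lookup(name);
    if (value !== undefined && value !== null) return { label, value };
  }
  return null; // not found anywhere cheap: time to read the JavaScript
}

// Stand-ins for the real page state (hypothetical values).
const pageUrl = new URL("http://example.invalid/song?fid=abc123");
const pageSource = '<div track="42"></div>';
const storage = { playlist: '["a","b"]' };

const sources = [
  ["page URL", (n) => pageUrl.searchParams.get(n)],
  ["page source", (n) => {
    // Naive attribute search; n is assumed to contain no regex metacharacters.
    const m = new RegExp(n + '="([^"]*)"').exec(pageSource);
    return m ? m[1] : null;
  }],
  ["local storage", (n) => storage[n]],
];

const fidHit = locateParam("fid", sources);
```

Putting the JavaScript source last in the list mirrors the advice above: it is the most expensive place to look.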
And when you have everything, you can attempt to replicate the request. So kind of based
on the syntax of the request you've seen as your example, take the parameters you have
and generate the same thing.
So the first target is Aimini.
This is a really great first target. They're a Flash‑based service but they use JavaScript
to load. And they have almost no security. I was able to exploit these guys without looking
at any code.
So this is the page with the network traffic. I've circled the network tab. And at the very
bottom you can see that we have an audio slash MP3 file, which is actually what we're looking
for. So if you wanted to take the easy way out, you could actually right‑click this
and open it in a new tab.
And then download it that way. So this is the cheap way out.
Okay. However, I was trying to show you guys how to automate this with JavaScript. So we're
going to do more inspection. So looking at the actual request, we see there's actually
only one parameter and I took out all the other headers because those are the standard
headers that your browser sends. But we have this FID. And I'm going to go out on a limb
and say FID stands for file ID, because they typically name things like this.
So now we look for the FID. So the first place to look is going to be in
the URL. And sure enough, it's in the URL. Great. We have everything we needed. Now
how do I duplicate what they did? So you go and look back at the original request and
you can see that they actually have this weird subdomain thing going on. And I found
out with deduction that the first four characters of FID are the first four characters of that
subdomain in reverse order.
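Purely as an illustration of the string relationship just described, and not of any request, the deduction can be written as a check on two strings. The sample values are made up, and the direction of the comparison is one reading of the sentence above.

```javascript
// Reverse the first n characters of a string.
function reversedPrefix(s, n = 4) {
  return s.slice(0, n).split("").reverse().join("");
}

// True when the subdomain starts with the reversed first n characters of fid.
function prefixIsReversed(fid, subdomain, n = 4) {
  return subdomain.slice(0, n) === reversedPrefix(fid, n);
}

const sampleCheck = prefixIsReversed("abcd9876", "dcba-files"); // made-up values
```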
So it wasn't very difficult to figure that one out. And you can easily replicate this
using JavaScript, but I'm not allowed to show you guys any exploit code.
So our next target is Grooveshark. And this is quite a step up. I chose these guys because
they have HTML5 and I was kind of wondering how that would play out in terms of difficulty.
And they use several factors of authentication and the JavaScript is minified, so it's hard
to read, so it's really not for the faint of heart.
So you want to make sure that you keep track of what you're doing the whole time. Know
what parameters you have, what you're looking for, what your next target is. And every
time you do something new, update your progress so you don't get lost because it can get confusing.
So I went ahead and I'm going to tell you guys that you're going to want to have a JavaScript
beautifier because it makes the glob on the left look like the glob on the right. And
while you still have characters like underscore, underscore P, at least you can have proper
spacing and you can read functions. So that's great.
And here, I skipped the network analysis, and this is the actual request itself.
I've highlighted the URL they have there, and all you
send to them is a stream key. So off the bat you think, hey, this is pretty
easy. I need one parameter, which is stream key. So that's what I'm going to look for.
So I also looked through all the traffic to see what was coming.
And I found this more.php file. And there is a get stream key from song ID EX method,
which takes several parameters, but what this will do is return the stream key, which is
what we're looking for. So we need the session, the token, the UUID, and the song ID. And
the method I highlighted because that changes all the time, but we know what it is because
it says up there. So we only need four parameters. I say only because it actually is easier than
it seems. And while I was looking at more.php, I found
this get communication token method, which uses a secret key. And I like secrets. So
I'm going to keep this in mind. So what do we need? Well, from the very beginning,
we know that as soon as we get the stream key, we can get the song. And we know to
get the stream key, we need to call more.php with this get stream key from song ID EX method.
And we need to pass it these four parameters. And more.php has the secret key. So I'm going
to be interested. So I already looked through everything. It's time for the JavaScript.
And this is what I get. And in the very first line, you see this window.gs.tpl. And I'm
guessing GS stands for Grooveshark. So I'm like, cool, they're storing stuff in the
JavaScript environment. Let's see what they have. I find this window.gs.config, which
has the session ID. So we're like, all right, that's one down. What else is there?
There's actually this window.gs.models.q.models, which turns out to be the entire playlist
which you have saved in memory. And every single song in this playlist has an ID, which
is the song ID. So right off the bat, we were able to find two of the parameters which
we needed just by looking at the very first line of the JavaScript file. So not bad. But
we still need the rest of the parameters. So this is kind of keeping
track of everything. We need the token of the user. We
need the UUID now. So I searched for the UUID, because it was easier. And sure enough,
I find a function that takes no parameters, which is good news, because now I don't have
to find any more parameters. I can just copy this function, and every time I need a new
UUID, I just call this function. So this is an easy copy and paste. So now we're left
with token. And this is where it gets a little bit more challenging. So I do a control F
for token. And I find this F.header.token, which turns out to be the token being put
into the header of the request that we saw. So looking at it, it takes ‑‑ there's
a bunch of stuff that we need. So going top down, like you should read code, there's this
R.last randomizer, which is equal to O. And remember what I said about functions that
take no parameters. You can just copy and paste them. And sure enough, that's the function
right there.
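Why zero-parameter functions are convenient: they depend on nothing else in the page, so they can be lifted out and called on their own. The sketch below is a generic version-4-style UUID generator written for this example; it is not the function from the service, and `Math.random` is not suitable where unpredictability matters.

```javascript
// Self-contained: takes no parameters and reads no outside state.
function makeUuid() {
  return "xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx".replace(/[xy]/g, (c) => {
    const r = Math.floor(Math.random() * 16);
    // "x" becomes any hex digit; "y" becomes one of 8, 9, a, b.
    const v = c === "x" ? r : (r & 0x3) | 0x8;
    return v.toString(16);
  });
}

const freshId = makeUuid();
```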
That's taken care of. We need this R.revtoken. So I do a control F for revtoken. And I find
revtoken, which is equal to N. N is equal to gooey flubber, which is the secret key
which they hope no one would find. Unfortunately, they probably shouldn't have put it right
on top of ‑‑ okay. So now we really need the current token, because if you recall,
method was just the method we were calling the URL with. And we had that documented.
So now we just need the current token. So I just searched for any instances of token
because I couldn't find any instances of current token. And, yeah, this is where the secret
key comes in. Because get communication token returns the token that we need for this request.
And so now we're on a hunt for the secret key. A control F shows that it is the
hex MD5 of the session ID, which we found on the very first step. So we already have
everything that we need.
And to just recap, we needed the stream key, and we got that by finding these four variables
which were just in the JavaScript, and the secret key was needed to get the token. So
we have everything, and with this information you can generate the request. But I can't
show you any exploit code, so we're just going to go straight to a demo. All right. So this
is Jango.com. Wait, I've got to switch.
All right. So this is Jango.com. They are a very small music streaming service, and
it's like an internet radio station. So the first thing I'm going to do is open up the
developer tools and go to the network tab. And you want to do this before you actually
play the song, because if you do it after you play the song, you're not going to see
it in the network traffic because it's already been loaded. So it's important to keep that
in mind.
All right.
So it plays, which is good. And I'm going to filter based on XHR traffic, down here.
And at the very bottom we have this audio slash MPEG file, which turns out to be what
we're looking for. So I'm going to click on it and look for anything that we might need.
And it turns out that they ‑‑ this is a statically assigned ‑‑ static streaming
website.
So we just need this file name. And it's important to keep in mind, here we have this weird directory
thing going on, which is also the first six characters of the song ID. So if we need that,
we have it right there.
And because I don't really know where to look for this file ID, because it doesn't have
a specific parameter name in front of it, so I can't do a control F, I'm just going
to look at the other traffic that we filter on, because there's only four other requests,
so it's not going to take us very long.
And you can see here that this responds with a bunch of JavaScript files.
And it has a ‑‑ like, it has a song ID here, but it doesn't actually correspond
to the 08, 06 one that we have. So that's a dead end right there.
But if you look here, this page actually returns the URL without the file name, just
the whole URL. So this is our target. And if you look at the headers, they're going
to take quite a few parameters. So we have first time, which is equal to 1. I'm guessing
that means true. So that's going to be a binary flag. You could probably lie on that if you
want to. An SID, which I'm not really sure what it is. A version number, which is probably
going to be the same every time. SUW, which I'm not sure what it is either. And CB, which
I'm also not sure about. But at least now we know what names we're looking for.
And at this point, I actually noticed that ‑‑
I'm glad that Chrome has this, but they have this initiator column, which tells you exactly
what script and what line made this request. So if we click on this, it will actually take
us to this line, which, if you notice, this is minified JavaScript, so you're not going
to be reading this very easily. And I've gone ahead and beautified it, so
now we can actually inspect the code. So if you recall, the URL back here, which
was ‑‑ it was this streams URL. So we're going to look for any instance of streams.
And sure enough, it takes us straight to the line that creates the request. And we have
this underscore JM station ID and some parameters, which are set right above it. And so we can
see first time is set to one. We have this end address. We
have this SID, which is apparently the session ID. We have the version number here. And SUW
apparently stands for whether the sign‑up window is visible or not. And so we have
everything. And then CB here is apparently the date and time.
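The general shape of such a request, a set of named parameters plus a timestamp used as a cache-buster, can be sketched as below. The host, the path, and the exact parameter spellings (`first_time`, `sid`, `ver`, `suw`, `cb`) are guesses made for illustration from the names spoken above; they are not confirmed by the source and the sketch is not the one-liner used in the demo.

```javascript
// Build a request URL from a base and named parameters, appending a
// timestamp cache-buster last. `now` is injectable so results are repeatable.
function buildRequestUrl(base, params, now = Date.now()) {
  const url = new URL(base);
  for (const [name, value] of Object.entries(params)) {
    url.searchParams.set(name, String(value));
  }
  url.searchParams.set("cb", String(now));
  return url.toString();
}

// Placeholder host and values only.
const sketchUrl = buildRequestUrl(
  "http://example.invalid/streams/12345",
  { first_time: 1, sid: "abc123", ver: 2, suw: 0 },
  1700000000000
);
```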
So we have everything we need. And I already went ahead and wrote a one‑line JavaScript,
which will generate this for us. So this spits out the URL that was done. And
it's generating the new song locations. So I'm going to copy that. And it turns out
they actually patched their service like three days before I came to DEF CON. And this is
freaking me out. Apparently they're doing some weird thing with checking your session,
but I found that the way to get around it is to just refresh
the radio station and then it will work. So here we have the next song that would be
playing. And if we actually just keep refreshing this,
it will give us a different song every time. So we can actually get every single song in
their music library.
But I did want to show you guys an exploit through my Chrome extension. So although I'm
not going to be releasing it, I can show you guys what it looks like.
So ‑‑
Okay.
So as you saw, there was a pop up and I clicked on it. And it takes us to my Chrome extension.
And you can select it and then hit download.
And it's right there.
So not very difficult. So that was the Chrome extension. But I did say that I was going
to be releasing this alternative tool. So what is this tool?
Well, I'm tentatively calling it browser shark until I get sued.
But I already bought the domain name, so I'm good.
Basically what it's going to do is if you recall earlier I mentioned I had a method
where I would copy any URL that matched the syntax of the song and then go retrieve it
myself.
Well, I decided that wouldn't it be cool if I could record all my traffic going through
and then cache that to the browser.
And what happened is now I can just go to Google and all my traffic will show up right
here.
And what you can do with this is you can actually analyze the hex of the request to make sure
that you're not getting any malware.
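For readers who have not looked at a hex view before, here is a minimal sketch of what a hex dump of cached bytes involves. It is written for this example and is not the tool's code.

```javascript
// Render bytes as rows of: offset, hex values, printable ASCII.
function hexDump(bytes, width = 16) {
  const lines = [];
  for (let i = 0; i < bytes.length; i += width) {
    const row = Array.from(bytes.slice(i, i + width));
    const hex = row.map((b) => b.toString(16).padStart(2, "0")).join(" ");
    const ascii = row
      .map((b) => (b >= 0x20 && b < 0x7f ? String.fromCharCode(b) : "."))
      .join("");
    const offset = i.toString(16).padStart(8, "0");
    // Pad the hex column so the ASCII column lines up on short final rows.
    lines.push(`${offset}  ${hex.padEnd(width * 3 - 1)}  ${ascii}`);
  }
  return lines.join("\n");
}

const dumpSample = hexDump(Uint8Array.from([0x49, 0x44, 0x33, 0x03]), 4);
```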
And it's nice enough to tell you what type of file it is.
So if they're lying to you, you can tell.
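One common way to tell what a file really is, regardless of what the server claims, is to compare its first bytes against known signatures. The sketch below assumes that approach and uses a handful of well-known signatures; how the tool itself decides is not stated in the talk.

```javascript
// A few well-known leading byte signatures ("magic numbers").
const SIGNATURES = [
  { type: "MP3 with ID3 tag", magic: [0x49, 0x44, 0x33] },        // "ID3"
  { type: "PNG image", magic: [0x89, 0x50, 0x4e, 0x47] },         // "\x89PNG"
  { type: "PDF document", magic: [0x25, 0x50, 0x44, 0x46] },      // "%PDF"
  { type: "ZIP archive", magic: [0x50, 0x4b, 0x03, 0x04] },       // "PK\x03\x04"
  { type: "Windows executable", magic: [0x4d, 0x5a] },            // "MZ"
];

// Return the first signature whose bytes match the start of the data.
function sniffType(bytes) {
  for (const { type, magic } of SIGNATURES) {
    if (magic.every((b, i) => bytes[i] === b)) return type;
  }
  return "unknown";
}

// A response labelled as audio whose bytes say otherwise.
const claimed = "audio/mpeg";
const actual = sniffType(Uint8Array.from([0x4d, 0x5a, 0x90, 0x00]));
const suspicious = claimed.startsWith("audio/") && actual !== "MP3 with ID3 tag";
```

Not every MP3 starts with an ID3 tag, so a mismatch is a reason to look closer in the hex view rather than proof of anything.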
And like I said before, this would be really cool with forensics and stuff like that.
And I'm going to go ahead and do that.
I'm planning on doing more coding so you can do a little bit more with the hex editor.
So this is a tool that I'm going to be releasing and I'll have the location to download it
at the end of my PowerPoint.
All right.
So things I learned.
Downloading music is inconvenient.
I found that after I had music, I didn't know what to do with it because managing all this
music was a pain.
So actually now I honestly just use Spotify because I don't like having to deal with files.
But services were fairly easy to exploit.
I think with all the different services which I listed at the very beginning, I found exploits
in them in three days total for all of them and the hardest one was Grooveshark which
took me a whole day.
Pandora actually was surprisingly easy.
And it's actually impossible to completely protect streaming.
Inherently.
And honestly, at some point, you're going to have your music on my computer and I own
my computer.
And even if you use encryption, you have to decrypt it so you can play it back and
at that point you can copy the files.
So inherently you can't protect streaming.
And some things you should know.
People have bad security.
This is a shocker.
And some people will patch their code.
Others will not.
This is the beast of security.
It's just the way it works.
And the same web traffic logging will work with video streaming services, too.
Some of them, not all of them.
People always ask me if Netflix will work.
No, Netflix will not work.
But if you go to some sketchy Chinese music streaming websites, I'm pretty sure this will
work as well.
But that's a topic for another day.
So I did a case study.
Originally the very first target I found was Last.fm, which if you guys aren't familiar
with is a British music streaming service.
And I found the vulnerability and I emailed them, I got no response.
I made this Chrome extension, I got no response.
But apparently they were able to fix it without my help.
So good on them.
And these are some things I noticed after they fixed it.
They secured it heavily.
They capped the bandwidth to match the playback speed.
So if you wanted to download the whole music library, it wouldn't
be feasible, because you would have to wait as long as it would take to play back all
that music, which is years.
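The "years" figure follows from simple arithmetic once downloads are capped at playback speed. The library size and average song length below are assumptions picked for the example, not numbers from the talk.

```javascript
// Time to fetch a library when each stream is capped at playback speed.
function yearsToDownload(songCount, avgSongSeconds, parallelStreams = 1) {
  const totalSeconds = (songCount * avgSongSeconds) / parallelStreams;
  const secondsPerYear = 365 * 24 * 60 * 60;
  return totalSeconds / secondsPerYear;
}

// Assumed: one million songs averaging four minutes, one stream at a time.
const estimateYears = yearsToDownload(1000000, 240);
```

Under those assumptions the estimate comes to roughly 7.6 years, and it only drops if more than one stream can be open at once.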
So that's a good way to prevent people from stealing all your music.
They also have one‑time use tokens.
Like I said earlier, once your first request is made, your second request is no longer
valid so you can't get the music.
And they also had it ‑‑ I also tried to do this really weird sketchy thing where
I would make sure that my requests would get there at almost the exact same time.
And what would happen was I would get a good ten seconds on my fake stream, but then it
would cut out because they only allow one stream open at a time.
So that was my first request.
That was pretty good.
And I couldn't exploit this.
And if you wanted to, it would take a huge amount of time.
It really wouldn't be worth it.
They have hundreds of lines of obfuscated code and the bandwidth cap makes it so you
can't really feasibly take all the music.
So some mitigations.
Using current technology, one‑time use tokens are definitely the best way to prevent
people from, like I showed you before, right‑clicking and opening a new tab and then saving it because
the second request won't work.
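A one-time use token can be sketched on the server side as a set of live tokens where redeeming removes the entry. This is an in-memory illustration with hypothetical names; a real service would generate tokens with a cryptographically secure source and expire them after a short time.

```javascript
class OneTimeTokens {
  constructor() {
    this.live = new Set();
    this.counter = 0;
  }

  // Hand out a fresh token. Math.random is for illustration only.
  issue() {
    const token = `tok-${++this.counter}-${Math.random().toString(36).slice(2)}`;
    this.live.add(token);
    return token;
  }

  // Set.delete returns true only if the token was still live, so the
  // first redemption succeeds and every later one fails.
  redeem(token) {
    return this.live.delete(token);
  }
}

const tokenStore = new OneTimeTokens();
const streamToken = tokenStore.issue();
const firstUse = tokenStore.redeem(streamToken);
const secondUse = tokenStore.redeem(streamToken);
```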
I've also seen people use RTMPE streams.
Okay.
So this is Adobe's protocol which they use and the E stands for encrypted.
I just want everyone to know that just because your protocol has an E in it doesn't mean
it's encrypted.
I've seen many services which have used regular RTMP non‑encrypted traffic and just put
the E as the protocol and that doesn't make it secure.
And also returning the song in pieces really helps as well.
Although ‑‑ so SoundCloud actually did this a couple days before DEF CON, but they
named all the pieces in numerical order, so that doesn't make it any more difficult
for me to put them back together.
And for future‑proofing, you can take a look at the HTML5 audio tag with DRM support
and these guys from Virginia Tech wrote a paper on it.
I haven't really looked much into it because I know inherently that nothing is going to
work.
But if someone is interested, that's there.
And so these are the references.
I actually have uploaded a few of them.
The browser shark thing to the Google Play store, whatever.
So that's the long URL.
I made a bit.ly if you trust me enough to click on it.
I assure you it's the same link.
I'm also putting this project on GitHub because I want it to be open source.
It's empty right now, but I'm going to be putting stuff on there in the next couple
days because I don't trust DEF CON Wi‑Fi.
And then there's my blog.
I sometimes put interesting stuff on there, sometimes don't, no guarantees.
And that's the paper and the JavaScript beautifier.
So there's my contact information if you want to talk to me.
And I'll take any questions now if anyone has any.
Yes?
How did you deal with renaming all those files?
Okay.
So the question is how I dealt with renaming.
In the exploit extension, what I did was I wrote a script which would hook onto the page
and I would use jQuery to take the file name and then the artist from the page itself because
they provide that information so the user knows.
Okay.
So I actually spent a good, like, several hours trying to synchronize all the songs
together, but that's how I did it.
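The renaming step can be sketched as turning an artist and a title into a safe file name. Reading the two strings off the page is left out; the function name and sample values are hypothetical.

```javascript
// Build "Artist - Title.ext", replacing characters that are awkward or
// invalid in file names and collapsing runs of whitespace.
function buildFileName(artist, title, ext = "mp3") {
  const clean = (s) =>
    s.replace(/[\\\/:*?"<>|]/g, "_").replace(/\s+/g, " ").trim();
  return `${clean(artist)} - ${clean(title)}.${ext}`;
}

const sampleName = buildFileName("AC/DC", "  Back  in Black ");
```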
Yes?
Why aren't you releasing your original plugin?
Because I don't like losing all my money to lawsuits.
Apparently the way it's written is I can get sued for $20,000 per count of trafficking
and they have millions of songs which would turn out to billions of dollars, which I don't
have.
Yes?
Have you tried using the style type?
Have you used the style type?
Yes.
I tried doing this with Spotify.
This guy in I think Norway or something did the same thing I was doing but happened to
release the code for it several months before I gave this talk, so they fixed it.
So it does not work with Spotify.
Yes?
Did you find that the songs you downloaded have the correct metadata on them?
It honestly depends on the service.
I think Pandora did a better job of that.
Some of them allow users to upload their own songs and you will get, like, really weird
cover ‑‑ like, ads as your cover art.
So it just depends on the service.
Yes?
Are these files still being watermarked?
There is no DRM in any of these.
These are just, like, straight ‑‑ like, you can play ‑‑ as you saw, I was playing
it back in my music player when I downloaded it.
So there is no DRM on any of these songs.
Okay.
Thank you.
Other questions?
All right.
Cool.
